ConcurrentHashMap in Detail

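HashMap is used everywhere in Java, but it is not thread-safe. Under concurrent modification it can lose data, and on JDK versions earlier than 1.8 it can even fall into an infinite loop (see the reference links for the cause). For multithreaded scenarios, use ConcurrentHashMap instead.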
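
To make the claim concrete, here is a minimal sketch in which several threads insert disjoint key ranges into a shared ConcurrentHashMap and no entry is lost. The class name ConcurrentPutDemo and its methods are hypothetical names chosen for this illustration. Passing a plain HashMap to fill instead may return a smaller size or fail in other ways, which is why the example does not do that.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentPutDemo {
    // Starts `threads` writers that each insert `perThread` distinct keys,
    // waits for all of them, and returns the resulting map size.
    static int fill(Map<Integer, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // each thread owns a disjoint key range
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.size();
    }

    static int fillConcurrentHashMap(int threads, int perThread) {
        return fill(new ConcurrentHashMap<>(), threads, perThread);
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrentHashMap(8, 10_000)); // 80000
    }
}
```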

The structure of ConcurrentHashMap is shown below:

(Image from the internet; it will be removed on request.)

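Before JDK 1.8, ConcurrentHashMap used segment locking. Whereas Hashtable simply synchronizes its read and write methods on the whole table, ConcurrentHashMap held multiple locks, each guarding one segment of the data. While one thread held the lock on a segment, other threads could still access the other segments, which significantly improved concurrent throughput. Starting with 1.8, ConcurrentHashMap dropped the segment design: the underlying data structure is an array plus linked lists plus red-black trees, and concurrency is controlled with volatile, CAS, and synchronized.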
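
The segment-locking idea can be sketched in a few lines. This is a simplified illustration, not the JDK 7 implementation: StripedMap, stripeFor, and the fixed stripe count are all assumptions made for the example, and unlike the real class it also locks on reads to keep the code short.

```java
import java.util.HashMap;
import java.util.Map;

class StripedMap<K, V> {
    private final Map<K, V>[] segments; // one independent HashMap per stripe

    @SuppressWarnings("unchecked")
    StripedMap(int stripes) {
        segments = (Map<K, V>[]) new Map[stripes];
        for (int i = 0; i < stripes; i++) {
            segments[i] = new HashMap<>();
        }
    }

    // Picks the segment responsible for this key.
    private Map<K, V> stripeFor(Object key) {
        return segments[(key.hashCode() & 0x7fffffff) % segments.length];
    }

    V put(K key, V value) {
        Map<K, V> segment = stripeFor(key);
        synchronized (segment) { // locks only this segment, not the whole map
            return segment.put(key, value);
        }
    }

    V get(Object key) {
        Map<K, V> segment = stripeFor(key);
        synchronized (segment) {
            return segment.get(key);
        }
    }

    // Sums segment sizes one lock at a time; not an atomic snapshot.
    int size() {
        int n = 0;
        for (Map<K, V> segment : segments) {
            synchronized (segment) {
                n += segment.size();
            }
        }
        return n;
    }
}
```

Two threads that touch keys in different segments never contend for the same lock, which is the whole point of the design.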

The rest of this article walks through the JDK 1.8 source of ConcurrentHashMap.

// With this constructor the table defaults to a size of 16 (it is allocated lazily on the first insert)
public ConcurrentHashMap() {
}

The put method inserts data into the map; it delegates to putVal:

final V putVal(K key, V value, boolean onlyIfAbsent) {
    // ConcurrentHashMap does not allow null keys or null values
    if (key == null || value == null) throw new NullPointerException();
    // Compute the hash
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        // If the table is empty, initialize it
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
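        // Locate the head node f of the bin at index i. If it is null, insert the new node there.
        // casTabAt inserts the node with Unsafe.compareAndSwapObject, which can be treated as an
        // atomic operation: it is implemented natively, typically with a single CPU instruction.
        // table is declared volatile, but volatile semantics apply only to the array reference,
        // not to its elements. That is why tabAt is used instead of tab[i]: U.getObjectVolatile
        // reads the slot directly from memory, so the latest element is always observed.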
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // If the CAS succeeds, break out of the loop and go on to addCount. If it fails,
            // another thread has already inserted a node here, so loop again
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        // A hash of -1 (MOVED) means f is a ForwardingNode: another thread is resizing the table
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            // Lock only the head node f with synchronized. Compared with Hashtable, which locks
            // the whole container, the lock granularity is much finer
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        binCount = 1;
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                // The key already exists in the list: update its value
                                oldVal = e.val;
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                // The key is not in the list: append the new node at the tail
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break;
                            }
                        }
                    }
                    else if (f instanceof TreeBin) {
                        Node<K,V> p;
                        binCount = 2;
                        // Update or insert the node in the red-black tree
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                       value)) != null) {
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            if (binCount != 0) {
                if (binCount >= TREEIFY_THRESHOLD)
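                    // Convert the linked list into a red-black tree (treeifyBin resizes the
                    // table instead while it is shorter than MIN_TREEIFY_CAPACITY, 64)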
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    // Update the element count and check whether a resize is needed
    addCount(1L, binCount);
    return null;
}
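
The two insertion paths above (CAS into an empty bin, synchronized on the head node otherwise) can be reproduced with public APIs. The sketch below is a simplified model, not the JDK code: it uses AtomicReferenceArray in place of Unsafe, has a fixed table of 16 bins, and never resizes or treeifies. BinInsertSketch and its Node class are hypothetical names; spread and HASH_BITS follow the JDK 8 definitions, which the listing above calls but does not show.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class BinInsertSketch {
    static final int HASH_BITS = 0x7fffffff;

    // Folds the high 16 bits into the low 16 bits and clears the sign bit,
    // so that negative hashes stay reserved for special nodes.
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    static final class Node {
        final int hash;
        final String key;
        volatile String val;
        volatile Node next;

        Node(int hash, String key, String val, Node next) {
            this.hash = hash;
            this.key = key;
            this.val = val;
            this.next = next;
        }
    }

    // Element reads and writes have volatile semantics (the role of tabAt/casTabAt).
    private final AtomicReferenceArray<Node> bins = new AtomicReferenceArray<>(16);

    // Returns the previous value for the key, or null if there was none.
    String put(String key, String value) {
        int hash = spread(key.hashCode());
        int i = (bins.length() - 1) & hash;
        for (;;) {
            Node f = bins.get(i);
            if (f == null) {
                // Empty bin: try to publish the node without any lock.
                if (bins.compareAndSet(i, null, new Node(hash, key, value, null))) {
                    return null;
                }
                continue; // lost the race, retry against the new head
            }
            synchronized (f) { // lock only this bin's head node
                if (bins.get(i) != f) {
                    continue; // head changed before we got the lock
                }
                for (Node e = f;;) {
                    if (e.hash == hash && e.key.equals(key)) {
                        String old = e.val;
                        e.val = value;
                        return old;
                    }
                    Node pred = e;
                    if ((e = e.next) == null) {
                        pred.next = new Node(hash, key, value, null);
                        return null;
                    }
                }
            }
        }
    }

    String get(String key) {
        int hash = spread(key.hashCode());
        for (Node e = bins.get((bins.length() - 1) & hash); e != null; e = e.next) {
            if (e.hash == hash && e.key.equals(key)) {
                return e.val;
            }
        }
        return null;
    }
}
```

The strings "Aa" and "BB" have the same hashCode, so inserting both exercises the CAS path first and then the synchronized append path.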

The initTable method initializes the table. It relies on CAS to make sure the table is not initialized more than once when several threads race:

private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        // A negative sizeCtl means another thread has won the CAS and is initializing the
        // table, so this thread yields its CPU time slice
        if ((sc = sizeCtl) < 0)
            Thread.yield(); // lost initialization race; just spin
        // A sizeCtl value of -1 means the table is being initialized
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            try {
                if ((tab = table) == null || tab.length == 0) {
                    int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                    @SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                    table = tab = nt;
                    sc = n - (n >>> 2);
                }
            } finally {
                sizeCtl = sc;
            }
            break;
        }
    }
    return tab;
}
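
The same CAS-guarded lazy initialization can be written with AtomicInteger instead of Unsafe. This is a sketch under stated assumptions: LazyTableSketch, the allocations counter, and the race helper are hypothetical and exist only to show that a single allocation happens no matter how many threads call initTable at once.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyTableSketch {
    static final int DEFAULT_CAPACITY = 16;

    private volatile Object[] table;
    private final AtomicInteger sizeCtl = new AtomicInteger(0);
    final AtomicInteger allocations = new AtomicInteger(); // instrumentation for the demo only

    Object[] initTable() {
        Object[] tab;
        while ((tab = table) == null) {
            int sc = sizeCtl.get();
            if (sc < 0) {
                Thread.yield(); // another thread is initializing; spin politely
            } else if (sizeCtl.compareAndSet(sc, -1)) { // claim the right to initialize
                try {
                    if ((tab = table) == null) { // re-check after winning the CAS
                        int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                        table = tab = new Object[n];
                        allocations.incrementAndGet();
                        sc = n - (n >>> 2); // next resize threshold: 0.75 * n
                    }
                } finally {
                    sizeCtl.set(sc); // release: publish the threshold
                }
                break;
            }
        }
        return tab;
    }

    int threshold() {
        return sizeCtl.get();
    }

    // Lets `threads` threads race on initTable and returns the shared instance.
    static LazyTableSketch race(int threads) {
        LazyTableSketch sketch = new LazyTableSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(sketch::initTable);
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sketch;
    }
}
```

With the default capacity of 16, the threshold stored back into sizeCtl is 16 - (16 >>> 2) = 12.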

Now look closely at the addCount method:

private final void addCount(long x, int check) {
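    // The element count (size) is the sum of baseCount and the values of all CounterCell
    // entries. Different threads can update different CounterCells at the same time; to get
    // the total number of nodes, just add everything up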
    CounterCell[] as; long b, s;
    if ((as = counterCells) != null ||
        !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
        CounterCell a; long v; int m;
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
            !(uncontended =
              U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
            fullAddCount(x, uncontended);
            return;
        }
        if (check <= 1)
            return;
        s = sumCount();
    }
    if (check >= 0) {
        Node<K,V>[] tab, nt; int n, sc;
        while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
               (n = tab.length) < MAXIMUM_CAPACITY) {
            int rs = resizeStamp(n);
            if (sc < 0) {
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                    sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                    transferIndex <= 0)
                    break;
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                    // Other threads can help with the resize
                    transfer(tab, nt);
            }
            else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                         (rs << RESIZE_STAMP_SHIFT) + 2))
                // Only one thread may start the resize, so the new table is allocated only once
                transfer(tab, null);
            s = sumCount();
        }
    }
}
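
The sizeCtl arithmetic in the resize branch is easier to follow with concrete numbers. The sketch below copies resizeStamp and the two RESIZE_STAMP constants from the JDK 8 source (they are not shown in the listing above); ResizeStampDemo, initialSizeCtl, and activeResizers are hypothetical helpers added for illustration.

```java
class ResizeStampDemo {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    // A stamp that identifies a resize of a table of length n; bit 15 is always set.
    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    // The value the first resizing thread CASes into sizeCtl.
    static int initialSizeCtl(int n) {
        return (resizeStamp(n) << RESIZE_STAMP_SHIFT) + 2;
    }

    // The low 16 bits hold (number of resizing threads + 1).
    static int activeResizers(int sc) {
        return (sc & 0xffff) - 1;
    }

    public static void main(String[] args) {
        int n = 16;
        int rs = resizeStamp(n);
        int sc = initialSizeCtl(n);
        System.out.println(rs);                                // 32795
        System.out.println(sc < 0);                            // true
        System.out.println((sc >>> RESIZE_STAMP_SHIFT) == rs); // true
        System.out.println(activeResizers(sc));                // 1
        System.out.println(activeResizers(sc + 1));            // 2
    }
}
```

Because bit 15 of the stamp ends up in the sign bit after the shift, sizeCtl is negative for the whole duration of a resize, which is what the sc < 0 test in addCount detects. Each helper thread then adds 1 to the low bits before calling transfer.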

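Updating the node count and resizing follow a similar idea: both let multiple threads work on separate parts of the data instead of locking the whole container, which would leave one thread working while all the others block. This partitioned design improves concurrent throughput.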
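
The striped counting used by addCount is available as a public class: java.util.concurrent.atomic.LongAdder keeps a base value plus an array of cells, the same design from which CounterCell is adapted. A minimal sketch of the idea, with StripedCountDemo and count as hypothetical names:

```java
import java.util.concurrent.atomic.LongAdder;

class StripedCountDemo {
    // Has `threads` threads each add 1 to a shared LongAdder `perThread` times
    // and returns the final sum.
    static long count(int threads, int perThread) {
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    adder.increment(); // contended threads spread over separate cells
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return adder.sum(); // base + all cells, like sumCount()
    }

    public static void main(String[] args) {
        System.out.println(count(8, 100_000)); // 800000
    }
}
```

As with sumCount, sum() is only guaranteed to be exact once the writers have finished; while updates are in flight it is a moving estimate.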

The get method is comparatively simple, and it takes no locks:

public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    int h = spread(key.hashCode());
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) {
        if ((eh = e.hash) == h) {
            // The head node of the bin matches: return its value directly
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        else if (eh < 0)
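            // A negative hash marks a special node (a TreeBin holding a red-black tree, or a
            // ForwardingNode during a resize): delegate the lookup to its find method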
            return (p = e.find(h, key)) != null ? p.val : null;
        while ((e = e.next) != null) {
            // Search the linked list
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}
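
From the caller's side, two consequences of this code are worth seeing. Because putVal rejects null values, a null result from get always means the key is absent, and because get calls key.hashCode() immediately, a null key throws NullPointerException. The class GetDemo and its helper methods are hypothetical names for this small sketch.

```java
import java.util.concurrent.ConcurrentHashMap;

class GetDemo {
    // Describes the lookup result; null can only mean "no mapping".
    static String describe(ConcurrentHashMap<String, Integer> map, String key) {
        Integer v = map.get(key);
        return (v == null) ? "absent" : "value=" + v;
    }

    // Returns true if get(null) throws NullPointerException.
    static boolean nullKeyThrows(ConcurrentHashMap<String, Integer> map) {
        try {
            map.get(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    static ConcurrentHashMap<String, Integer> sample() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        return map;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = sample();
        System.out.println(describe(map, "a"));  // value=1
        System.out.println(describe(map, "b"));  // absent
        System.out.println(nullKeyThrows(map));  // true
    }
}
```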

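References:

Java Magic Class: An Analysis of Unsafe Applications (Java魔法类：Unsafe应用解析)

The Must-Know Story of Java "Locks" (不可不说的Java“锁”事)

ConcurrentHashMap 1.8 Explained Simply (深入浅出ConcurrentHashMap1.8)

Java volatile array?

There Is Even More to Dig Out of ConcurrentHashMap! (ConcurrentHashMap竟然还能挖出这些东西！)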